Large-sample tests of extreme-value dependence for multivariate copulas



Ivan Kojadinovic 

Laboratoire de mathématiques et applications, UMR CNRS 5142
Université de Pau et des Pays de l'Adour
B.P. 1155, 64013 Pau Cedex, France
ivan.kojadinovic@univ-pau.fr

Johan Segers 

Institut de statistique, biostatistique et sciences actuarielles 

Université catholique de Louvain
Voie du Roman Pays 20, B-1348 Louvain-la-Neuve, Belgium
johan.segers@uclouvain.be

Jun Yan 

Department of Statistics 
University of Connecticut, 215 Glenbrook Rd. U-4120 
Storrs, CT 06269, USA 
jun.yan@uconn.edu

Abstract 

Starting from the characterization of extreme-value copulas based on max- 
stability, large-sample tests of extreme-value dependence for multivariate copulas 
are studied. The two key ingredients of the proposed tests are the empirical copula 
of the data and a multiplier technique for obtaining approximate p-values for the 
derived statistics. The asymptotic validity of the multiplier approach is established, 
and the finite-sample performance of a large number of candidate test statistics is 
studied through extensive Monte Carlo experiments for data sets of dimension two 
to five. In the bivariate case, the rejection rates of the best versions of the tests
are compared with those of the test of Ghoudi et al. (1998) recently revisited by
Ben Ghorbal et al. (2009). The proposed procedures are illustrated on bivariate
financial data and trivariate geological data.

Keywords: max-stability, multiplier central limit theorem, pseudo-observations,
ranks.

1 Introduction



Let $X$ be a $d$-dimensional random vector with continuous marginal cumulative distribution
functions (c.d.f.s) $F_1, \dots, F_d$. It is then well known from the work of Sklar (1959)
that the c.d.f. $F$ of $X$ can be written in a unique way as

$$F(x) = C\{F_1(x_1), \dots, F_d(x_d)\}, \qquad x \in \mathbb{R}^d,$$

where the function $C : [0,1]^d \to [0,1]$ is a copula and can be regarded as capturing the
dependence between the components of $X$. If, additionally, $C$ is max-stable, i.e., if

$$C(u) = \{C(u_1^{1/r}, \dots, u_d^{1/r})\}^r, \qquad \forall\, u \in [0,1]^d, \quad \forall\, r > 0, \tag{1}$$



the function $C$ is an extreme-value copula. Such copulas arise in the limiting joint
distributions of suitably normalized componentwise maxima (Galambos, 1987; Gudendorf
and Segers, 2010) and are the subject of increasing practical interest in finance (McNeil
et al., 2005), insurance (Frees and Valdez, 1998) and hydrology (Salvadori et al., 2007).

Given a random sample $X_1, \dots, X_n$ from c.d.f. $C\{F_1(x_1), \dots, F_d(x_d)\}$, it is of interest
in many applications to test whether the unknown copula $C$ belongs to the class of
extreme-value copulas. A first solution to this problem was proposed in the bivariate case
by Ghoudi et al. (1998), who derived a test based on the bivariate probability integral
transformation. The suggested approach was recently improved by Ben Ghorbal et al.
(2009), who investigated the finite-sample performance of three versions of the test.

The aim of this paper is to study tests of extreme-value dependence for multivariate
copulas based on characterization (1). The first key element of the proposed approach
is the empirical copula of the data, which is a nonparametric estimator of the true
unknown copula. Starting from characterization (1), the empirical copula can be used to
derive natural classes of empirical processes for testing max-stability. As the distribution
of these processes is unwieldy, one has to resort to a multiplier technique to compute
approximate p-values for candidate test statistics. This is the second key element of the
proposed approach and is based on the seminal work of Scaillet (2005) and Rémillard
and Scaillet (2009), revisited recently in Segers (2011). The outcome of this work is a
general procedure for testing extreme-value dependence which, in principle, can be used
in any dimension.

The second section of the paper is devoted to recent results on the weak convergence of
the empirical copula process obtained in Segers (2011). A detailed and rigorous description
of the proposed tests is given in Section 3, while their implementation is discussed in
Section 4. In the fifth section, the results of an extensive Monte Carlo study are partially
reported. They are used to provide recommendations in Section 6 enabling the proposed
approach to be safely used to test extreme-value dependence in data sets of dimension
two to five. The test based on one of the best performing statistics is finally used to
test bivariate extreme-value dependence in the well-known insurance data of Frees and
Valdez (1998), and trivariate extreme-value dependence in the uranium exploration data
of Cook and Johnson (1986).

The following notational conventions are adopted in the sequel. The arrow '$\rightsquigarrow$'
denotes weak convergence in the sense of Definition 1.3.3 in van der Vaart and Wellner
(2000), and $\ell^\infty([0,1]^d)$ represents the space of all bounded real-valued functions on $[0,1]^d$
equipped with the uniform metric. Also, for any $u \in [0,1]^d$ and any $r > 0$, we adopt the
notation $u^r = (u_1^r, \dots, u_d^r)$. Furthermore, the set of extreme-value copulas, i.e., copulas
satisfying (1), is denoted by $\mathcal{EV}$.

Note finally that all the tests studied in this work are implemented in the R package
copula (Kojadinovic and Yan, 2010) available on the Comprehensive R Archive Network
(R Development Core Team, 2011).



2 Weak convergence of the empirical copula process 



Let $\hat U_i = (\hat U_{i1}, \dots, \hat U_{id})$, $i \in \{1, \dots, n\}$, be pseudo-observations from the copula $C$
computed from the data by $\hat U_{ij} = R_{ij}/(n+1)$, where $R_{ij}$ is the rank of $X_{ij}$ among $X_{1j}, \dots, X_{nj}$.
The pseudo-observations can equivalently be rewritten as $\hat U_{ij} = n \hat F_j(X_{ij})/(n+1)$, where
$\hat F_j$ is the empirical c.d.f. computed from $X_{1j}, \dots, X_{nj}$, and where the scaling factor
$n/(n+1)$ is classically introduced to avoid problems at the boundary of $[0,1]^d$. The
proposed tests are based on the empirical copula of the data (Deheuvels, 1979, 1981),
which is usually defined as the empirical c.d.f. computed from the pseudo-observations,
i.e.,

$$C_n(u) = \frac{1}{n} \sum_{i=1}^n \mathbf{1}(\hat U_i \le u), \qquad u \in [0,1]^d.$$
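
As an illustration of the two definitions above, the following minimal Python sketch computes rank-based pseudo-observations and evaluates the empirical copula at a point. The function names and the four-point sample are hypothetical and chosen only for illustration; the sketch assumes there are no ties within a column.

```python
def pseudo_observations(data):
    """Return U[i][j] = R_ij / (n + 1), with R_ij the rank of data[i][j] within column j.

    Assumes no ties within a column (continuous margins)."""
    n, d = len(data), len(data[0])
    pseudo = [[0.0] * d for _ in range(n)]
    for j in range(d):
        # Indices of the observations sorted by their j-th coordinate.
        order = sorted(range(n), key=lambda i: data[i][j])
        for rank, i in enumerate(order, start=1):
            pseudo[i][j] = rank / (n + 1)
    return pseudo


def empirical_copula(pseudo, u):
    """C_n(u): proportion of pseudo-observations that are <= u componentwise."""
    count = sum(all(a <= b for a, b in zip(p, u)) for p in pseudo)
    return count / len(pseudo)


sample = [(0.3, 1.2), (2.5, 0.7), (1.1, 3.4), (4.0, 2.9)]  # made-up bivariate data
U = pseudo_observations(sample)
print(U)                                 # [[0.2, 0.4], [0.6, 0.2], [0.4, 0.8], [0.8, 0.6]]
print(empirical_copula(U, (0.5, 0.5)))   # 0.25
```

Only the first pseudo-observation lies below (0.5, 0.5) in both coordinates, hence the value 1/4.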



For any $j \in \{1, \dots, d\}$, let $C^{[j]}(u)$ be the partial derivative of $C$ with respect to its
$j$th argument at $u$, i.e.,

$$C^{[j]}(u) = \lim_{\substack{h \to 0 \\ u_j + h \in [0,1]}} \frac{C(u_1, \dots, u_{j-1}, u_j + h, u_{j+1}, \dots, u_d) - C(u)}{h}, \qquad u \in [0,1]^d.$$

It is well known (see, e.g., Nelsen, 2006, Theorem 2.2.7) that $C^{[j]}$ exists almost everywhere
on $[0,1]^d$ and that, for those $u \in [0,1]^d$ for which it exists, $0 \le C^{[j]}(u) \le 1$. If $C^{[j]}$
exists and is continuous on $[0,1]^d$ for all $j \in \{1, \dots, d\}$, then, from Corollary 5.3 of van
der Vaart and Wellner (2007) (see also Stute, 1984; Gänssler and Stute, 1987; Fermanian
et al., 2004; Tsukahara, 2005), the empirical copula process $\mathbb{C}_n = \sqrt{n}(C_n - C)$ converges
weakly in $\ell^\infty([0,1]^d)$ to the tight centered Gaussian process

$$\mathbb{C}(u) = \alpha(u) - \sum_{j=1}^d C^{[j]}(u)\, \alpha(1, \dots, 1, u_j, 1, \dots, 1), \qquad u \in [0,1]^d, \tag{2}$$

where $\alpha$ is a $C$-Brownian bridge, i.e., a tight centered Gaussian process on $[0,1]^d$ with
covariance function $E[\alpha(u)\alpha(v)] = C(u \wedge v) - C(u)C(v)$, $u, v \in [0,1]^d$. Without loss of
generality, we assume in the sequel that $\alpha$ has continuous sample paths.

For many copula families however, the partial derivatives $C^{[j]}$, $j \in \{1, \dots, d\}$, fail to
be continuous on the whole of $[0,1]^d$. For instance, as shown in Segers (2011), many
popular bivariate extreme-value copulas have discontinuous partial derivatives at $(0,0)$
and $(1,1)$. To deal with such situations, Segers (2011) considered the following less
restrictive condition:

(C) for any $j \in \{1, \dots, d\}$, $C^{[j]}$ exists and is continuous on the set $V_j = \{u \in [0,1]^d : 0 < u_j < 1\}$.

Under Condition (C), for any $j \in \{1, \dots, d\}$, Segers (2011) extended the domain of $C^{[j]}$
to the whole of $[0,1]^d$ by setting

$$C^{[j]}(u) = \begin{cases} \displaystyle\limsup_{h \downarrow 0} \frac{C(u_1, \dots, u_{j-1}, h, u_{j+1}, \dots, u_d)}{h}, & \text{if } u \in [0,1]^d,\ u_j = 0, \\[2ex] \displaystyle\limsup_{h \downarrow 0} \frac{C(u) - C(u_1, \dots, u_{j-1}, 1 - h, u_{j+1}, \dots, u_d)}{h}, & \text{if } u \in [0,1]^d,\ u_j = 1, \end{cases}$$

which ensures that the process $\mathbb{C}$ defined in (2) is well defined on the whole of $[0,1]^d$,
and showed the weak convergence of the empirical copula process $\mathbb{C}_n$ to $\mathbb{C}$ in $\ell^\infty([0,1]^d)$.
Condition (C) was verified in Segers (2011) for many popular families including many
$d$-dimensional extreme-value copulas.



3 Description of the tests 

Starting from characterization (1) and having at hand a nonparametric copula estimator
such as $C_n$, it seems natural to base tests of the hypothesis $H_0 : C \in \mathcal{EV}$ on processes of
the form

$$\mathbb{D}_{r,n}(u) = \sqrt{n}\left[\{C_n(u^{1/r})\}^r - C_n(u)\right], \qquad u \in [0,1]^d, \tag{3}$$

where $r > 0$. Alternatively, since characterization (1) can equivalently be rewritten as

$$\{C(u)\}^r = C(u^r), \qquad \forall\, u \in [0,1]^d, \quad \forall\, r > 0,$$

one could also consider test processes of the form

$$\mathbb{E}_{r,n}(u) = \sqrt{n}\left[C_n(u^r) - \{C_n(u)\}^r\right], \qquad u \in [0,1]^d.$$

For a given value of $r$, such processes can be used to test the hypothesis

$$H_{0,r} : C(u) = \{C(u^{1/r})\}^r \quad \forall\, u \in [0,1]^d.$$

Since $H_0 = \bigcap_{r > 0} H_{0,r}$, testing $H_{0,r}$ for a fixed value of $r$ is clearly not equivalent to
testing $H_0$. It follows that tests based on $\mathbb{D}_{r,n}$ or $\mathbb{E}_{r,n}$, with $r$ fixed, will only be consistent
for copula alternatives for which there exists $u \in [0,1]^d$ such that $C(u) \ne \{C(u^{1/r})\}^r$.

In our Monte Carlo experiments, values of $r$ smaller than one did not lead to
well-behaved tests. Besides, the processes $\mathbb{D}_{r,n}$ consistently led to more powerful tests
than the processes $\mathbb{E}_{r,n}$. For the sake of brevity, we therefore only present the derivation
of the tests based on $\mathbb{D}_{r,n}$ with $r > 1$.
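
To make the test process concrete, here is a minimal Python sketch that evaluates $\sqrt{n}\,[\{C_n(u^{1/r})\}^r - C_n(u)]$ at a single point. The function names and the two toy sets of pseudo-observations (perfectly comonotone and perfectly countermonotone points) are assumptions made for illustration, not data from the paper.

```python
import math


def empirical_copula(pseudo, u):
    """C_n(u): proportion of pseudo-observations that are <= u componentwise."""
    return sum(all(a <= b for a, b in zip(p, u)) for p in pseudo) / len(pseudo)


def max_stability_process(pseudo, u, r):
    """Evaluate sqrt(n) * ( C_n(u^{1/r})^r - C_n(u) ) at the point u."""
    n = len(pseudo)
    u_root = [x ** (1.0 / r) for x in u]  # componentwise power u^{1/r}
    return math.sqrt(n) * (empirical_copula(pseudo, u_root) ** r - empirical_copula(pseudo, u))


# Comonotone points: the underlying copula min(u1, u2) is max-stable.
comonotone = [(0.2, 0.2), (0.4, 0.4), (0.6, 0.6), (0.8, 0.8)]
# Countermonotone points: the underlying copula is not max-stable.
countermonotone = [(0.2, 0.8), (0.4, 0.6), (0.6, 0.4), (0.8, 0.2)]

print(max_stability_process(comonotone, (0.25, 0.25), 2))
print(max_stability_process(countermonotone, (0.49, 0.49), 2))
```

At the chosen points the first value is zero and the second is 0.5, which is the kind of departure from max-stability that the test statistics aggregate over the unit hypercube.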

The following result, whose short proof is given in the appendix, gives the asymptotic
behavior of the test process (3) under $H_{0,r}$.

Proposition 1. Suppose that the partial derivatives of $C$ satisfy Condition (C), and let
$r > 1$. Then, under $H_{0,r}$,

$$\mathbb{D}_{r,n}(u) \rightsquigarrow \mathbb{D}_r(u) = r\{C(u^{1/r})\}^{r-1}\, \mathbb{C}(u^{1/r}) - \mathbb{C}(u) \tag{4}$$

in $\ell^\infty([0,1]^d)$.

Before suggesting two candidate test statistics based on $\mathbb{D}_{r,n}$, let us first explain how,
for large $n$, approximate independent copies of $\mathbb{D}_r$ can be obtained by means of a multiplier
technique initially proposed in Scaillet (2005) and Rémillard and Scaillet (2009), and
recently revisited in Segers (2011).

As can be seen from (4), to obtain approximate independent copies of $\mathbb{D}_r$, it is
necessary to obtain approximate independent copies of $\mathbb{C}$. To estimate the unknown partial
derivative $C^{[j]}$, $j \in \{1, \dots, d\}$, appearing in the expression of $\mathbb{C}$ given in (2), we use the
estimator defined by

$$C_n^{[j]}(u) = \frac{1}{u_{j,n}^+ - u_{j,n}^-} \left\{ C_n(u_1, \dots, u_{j-1}, u_{j,n}^+, u_{j+1}, \dots, u_d) - C_n(u_1, \dots, u_{j-1}, u_{j,n}^-, u_{j+1}, \dots, u_d) \right\}, \qquad u \in [0,1]^d, \tag{5}$$

where $u_{j,n}^+ = (u_j + n^{-1/2}) \wedge 1$ and $u_{j,n}^- = (u_j - n^{-1/2}) \vee 0$.

This estimator differs slightly from the one initially proposed in Rémillard and Scaillet
(2009). It has the advantage of converging in probability to $C^{[j]}$ uniformly over $[0,1]^d$ if
$C^{[j]}$ happens to be continuous on $[0,1]^d$ instead of only satisfying Condition (C). This
point is discussed in more detail in the appendix.
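
The finite-difference idea behind this estimator can be sketched in a few lines of Python. The sketch assumes that the difference of empirical copula values is divided by the width of the interval actually used after truncation to $[0,1]$, with bandwidth $n^{-1/2}$; the function names and the toy comonotone points are hypothetical.

```python
import math


def empirical_copula(pseudo, u):
    """C_n(u): proportion of pseudo-observations that are <= u componentwise."""
    return sum(all(a <= b for a, b in zip(p, u)) for p in pseudo) / len(pseudo)


def partial_derivative_estimate(pseudo, u, j):
    """Finite-difference estimate of the j-th partial derivative of the copula at u.

    The j-th coordinate is moved by n^{-1/2} in each direction, truncated to [0, 1],
    and the difference is divided by the width of the interval actually used."""
    step = 1.0 / math.sqrt(len(pseudo))
    upper = min(u[j] + step, 1.0)
    lower = max(u[j] - step, 0.0)
    u_plus = [upper if l == j else x for l, x in enumerate(u)]
    u_minus = [lower if l == j else x for l, x in enumerate(u)]
    return (empirical_copula(pseudo, u_plus) - empirical_copula(pseudo, u_minus)) / (upper - lower)


# Toy comonotone points: the copula is min(u1, u2), whose partial derivatives
# at (0.5, 0.95) are 1 (first argument) and 0 (second argument).
points = [(0.2, 0.2), (0.4, 0.4), (0.6, 0.6), (0.8, 0.8)]
print(partial_derivative_estimate(points, (0.5, 0.95), 0))
print(partial_derivative_estimate(points, (0.5, 0.95), 1))
```

Dividing by the truncated width rather than by a fixed $2n^{-1/2}$ keeps the estimate on the right scale near the boundary of the unit hypercube.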



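Let us now introduce additional notation. Let $N$ be a large integer and let $Z_i^{(k)}$,
$i \in \{1, \dots, n\}$, $k \in \{1, \dots, N\}$, be i.i.d. random variables with mean 0 and variance 1
satisfying $\int_0^\infty \{\Pr(|Z_i^{(k)}| > x)\}^{1/2}\, dx < \infty$, and independent of the data $X_1, \dots, X_n$. For
any $k \in \{1, \dots, N\}$ and any $u \in [0,1]^d$, let

$$\alpha_n^{(k)}(u) = \frac{1}{\sqrt{n}} \sum_{i=1}^n Z_i^{(k)} \left\{ \mathbf{1}(\hat U_i \le u) - C_n(u) \right\} = \frac{1}{\sqrt{n}} \sum_{i=1}^n \left( Z_i^{(k)} - \bar Z^{(k)} \right) \mathbf{1}(\hat U_i \le u),$$

where $\bar Z^{(k)} = n^{-1} \sum_{i=1}^n Z_i^{(k)}$. Furthermore, for any $k \in \{1, \dots, N\}$ and any $u \in [0,1]^d$, let

$$\mathbb{C}_n^{(k)}(u) = \alpha_n^{(k)}(u) - \sum_{j=1}^d C_n^{[j]}(u)\, \alpha_n^{(k)}(1, \dots, 1, u_j, 1, \dots, 1),$$

and let

$$\mathbb{D}_{r,n}^{(k)}(u) = r\{C_n(u^{1/r})\}^{r-1}\, \mathbb{C}_n^{(k)}(u^{1/r}) - \mathbb{C}_n^{(k)}(u).$$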
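
The multiplier version of the empirical process can be written either with centered indicators or with centered multipliers. The following Python sketch checks numerically that the two forms coincide; the function names are hypothetical, and the uniform points used here are only stand-ins for pseudo-observations.

```python
import math
import random


def multiplier_process(points, z, u):
    """n^{-1/2} * sum_i Z_i * ( 1(U_i <= u) - C_n(u) ): centered indicators."""
    n = len(points)
    ind = [float(all(a <= b for a, b in zip(p, u))) for p in points]
    c_n = sum(ind) / n
    return sum(z_i * (x - c_n) for z_i, x in zip(z, ind)) / math.sqrt(n)


def multiplier_process_centered(points, z, u):
    """n^{-1/2} * sum_i (Z_i - Zbar) * 1(U_i <= u): centered multipliers."""
    n = len(points)
    z_bar = sum(z) / n
    ind = [float(all(a <= b for a, b in zip(p, u))) for p in points]
    return sum((z_i - z_bar) * x for z_i, x in zip(z, ind)) / math.sqrt(n)


rng = random.Random(0)
points = [(rng.random(), rng.random()) for _ in range(50)]  # stand-in points, not real ranks
z = [rng.gauss(0.0, 1.0) for _ in range(50)]                # standard normal multipliers
gap = abs(multiplier_process(points, z, (0.5, 0.5))
          - multiplier_process_centered(points, z, (0.5, 0.5)))
print(gap)  # zero up to floating-point rounding
```

The second form is the convenient one in practice because the indicators can be computed once and reused for every set of multipliers.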



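The following result, whose proof is given in Appendix B, is at the root of the proposed
class of tests.

Proposition 2. Suppose that the partial derivatives of $C$ satisfy Condition (C), and let
$r > 1$. Then, under $H_{0,r}$,

$$\left( \mathbb{D}_{r,n}, \mathbb{D}_{r,n}^{(1)}, \dots, \mathbb{D}_{r,n}^{(N)} \right) \rightsquigarrow \left( \mathbb{D}_r, \mathbb{D}_r^{(1)}, \dots, \mathbb{D}_r^{(N)} \right)$$

in $\{\ell^\infty([0,1]^d)\}^{(N+1)}$, where $\mathbb{D}_r^{(1)}, \dots, \mathbb{D}_r^{(N)}$ are independent copies of the process $\mathbb{D}_r$ defined
in (4).

As candidate test statistics, we consider Cramér-von Mises functionals of the form

$$S_{r,n} = \int_{[0,1]^d} \{\mathbb{D}_{r,n}(u)\}^2\, du \qquad \text{and} \qquad T_{r,n} = \int_{[0,1]^d} \{\mathbb{D}_{r,n}(u)\}^2\, dC_n(u).$$

Also, for any $k \in \{1, \dots, N\}$, let

$$S_{r,n}^{(k)} = \int_{[0,1]^d} \{\mathbb{D}_{r,n}^{(k)}(u)\}^2\, du \qquad \text{and} \qquad T_{r,n}^{(k)} = \int_{[0,1]^d} \{\mathbb{D}_{r,n}^{(k)}(u)\}^2\, dC_n(u).$$

The following key result is proved in Appendix B.

Proposition 3. Suppose that the partial derivatives of $C$ satisfy Condition (C), and let
$r > 1$. Then, under $H_{0,r}$,

$$\left( S_{r,n}, S_{r,n}^{(1)}, \dots, S_{r,n}^{(N)} \right) \rightsquigarrow \left( S_r, S_r^{(1)}, \dots, S_r^{(N)} \right)$$

and

$$\left( T_{r,n}, T_{r,n}^{(1)}, \dots, T_{r,n}^{(N)} \right) \rightsquigarrow \left( T_r, T_r^{(1)}, \dots, T_r^{(N)} \right)$$

in $[0, \infty)^{(N+1)}$, where

$$S_r = \int_{[0,1]^d} \{\mathbb{D}_r(u)\}^2\, du \qquad \text{and} \qquad T_r = \int_{[0,1]^d} \{\mathbb{D}_r(u)\}^2\, dC(u)$$

are the weak limits of $S_{r,n}$ and $T_{r,n}$, respectively, and $S_r^{(1)}, \dots, S_r^{(N)}$ and $T_r^{(1)}, \dots, T_r^{(N)}$ are
independent copies of $S_r$ and $T_r$, respectively.

The previous results suggest computing approximate p-values for $S_{r,n}$ and $T_{r,n}$ as

$$\frac{1}{N} \sum_{k=1}^N \mathbf{1}\left( S_{r,n}^{(k)} \ge S_{r,n} \right) \qquad \text{and} \qquad \frac{1}{N} \sum_{k=1}^N \mathbf{1}\left( T_{r,n}^{(k)} \ge T_{r,n} \right),$$

respectively.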
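
The approximate p-value is simply the proportion of multiplier replicates that are at least as large as the observed statistic. A minimal Python sketch, with a hypothetical function name and made-up numbers, is:

```python
def multiplier_p_value(observed, replicates):
    """Proportion of multiplier replicates that are >= the observed statistic."""
    return sum(rep >= observed for rep in replicates) / len(replicates)


# Made-up values: two of the four replicates exceed the observed statistic.
print(multiplier_p_value(2.0, [0.5, 2.5, 1.0, 3.0]))  # 0.5
```

With N replicates the p-value is a multiple of 1/N, so N bounds the resolution of the reported p-values.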

Notice that, when $H_{0,r}$ is not true, the processes $\mathbb{D}_{r,n}^{(k)}$, $k \in \{1, \dots, N\}$, cannot
be regarded anymore as approximate independent copies of $\mathbb{D}_{r,n}$ under $H_{0,r}$ because
$X_1, \dots, X_n$ is no longer a random sample from a c.d.f. $C\{F_1(x_1), \dots, F_d(x_d)\}$, where
$C$ satisfies $C(u) = \{C(u^{1/r})\}^r$ for all $u \in [0,1]^d$. This however does not affect the
consistency of the procedure with respect to the hypothesis $H_{0,r}$. Indeed, the process $\mathbb{D}_{r,n}$
can be decomposed as

$$\mathbb{D}_{r,n}(u) = \sqrt{n}\left[\{C_n(u^{1/r})\}^r - \{C(u^{1/r})\}^r\right] - \sqrt{n}\{C_n(u) - C(u)\} + \sqrt{n}\left[\{C(u^{1/r})\}^r - C(u)\right], \qquad u \in [0,1]^d.$$

Whether $H_{0,r}$ is false or not, provided Condition (C) is satisfied, the first and second terms
will jointly converge weakly to the limit established in the proof of Proposition 1 (see (7)
in the appendix), while, if $H_{0,r}$ is false,

$$\sup_{u \in [0,1]^d} \sqrt{n}\, \left| \{C(u^{1/r})\}^r - C(u) \right| \to \infty.$$

On the other hand, from the proof of Proposition 2, it is easy to verify that, provided
Condition (C) is satisfied, and whether $H_{0,r}$ is false or not,

$$\left( \mathbb{D}_{r,n}^{(1)}, \dots, \mathbb{D}_{r,n}^{(N)} \right) \rightsquigarrow \left( \mathbb{D}_r^{(1)}, \dots, \mathbb{D}_r^{(N)} \right)$$

in $\{\ell^\infty([0,1]^d)\}^N$, where $\mathbb{D}_r^{(1)}, \dots, \mathbb{D}_r^{(N)}$ are independent copies of the process $\mathbb{D}_r$ defined
in (4). It follows that the statistics $S_{r,n}$ and $T_{r,n}$, as any sensible statistic derived from
the process $\mathbb{D}_{r,n}$, will be consistent with respect to the hypothesis $H_{0,r}$.

To improve the sensitivity of the proposed tests, given $p$ reals $r_1, \dots, r_p > 1$, we also
consider tests based on statistics of the form

$$S_{r_1, \dots, r_p, n} = \sum_{i=1}^p S_{r_i, n} \qquad \text{and} \qquad T_{r_1, \dots, r_p, n} = \sum_{i=1}^p T_{r_i, n}. \tag{6}$$

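4 Implementation of the tests based on $S_{r,n}$ and $T_{r,n}$

We first discuss the implementation of the test based on $S_{r,n}$. The implementation of the
test based on $T_{r,n}$ follows immediately after a simple modification.

Given a large integer $m > 0$, we proceed by numerical approximation based on a grid
of $m$ uniformly spaced points on $(0,1)^d$ denoted $w_1, \dots, w_m$. Then,

$$S_{r,n} \approx \frac{1}{m} \sum_{j=1}^m \{\mathbb{D}_{r,n}(w_j)\}^2$$

and, for any $k \in \{1, \dots, N\}$,

$$S_{r,n}^{(k)} \approx \frac{1}{m} \sum_{j=1}^m \{\mathbb{D}_{r,n}^{(k)}(w_j)\}^2.$$

To efficiently implement the test, first notice that, for any $k \in \{1, \dots, N\}$, $\mathbb{C}_n^{(k)}$ can be
conveniently written as

$$\mathbb{C}_n^{(k)}(u) = \frac{1}{\sqrt{n}} \sum_{i=1}^n \left( Z_i^{(k)} - \bar Z^{(k)} \right) \left\{ \mathbf{1}(\hat U_i \le u) - \sum_{l=1}^d C_n^{[l]}(u)\, \mathbf{1}(\hat U_{il} \le u_l) \right\}, \qquad u \in [0,1]^d.$$

It follows that the $\mathbb{D}_{r,n}^{(k)}(w_j)$ can be expressed as

$$\mathbb{D}_{r,n}^{(k)}(w_j) = \frac{1}{\sqrt{n}} \sum_{i=1}^n \left( Z_i^{(k)} - \bar Z^{(k)} \right) M_n(i,j)$$

in terms of an $n \times m$ matrix $M_n$ whose $(i,j)$-element is

$$M_n(i,j) = r\{C_n(w_j^{1/r})\}^{r-1} \left\{ \mathbf{1}(\hat U_i \le w_j^{1/r}) - \sum_{l=1}^d C_n^{[l]}(w_j^{1/r})\, \mathbf{1}(\hat U_{il} \le w_{jl}^{1/r}) \right\} - \left\{ \mathbf{1}(\hat U_i \le w_j) - \sum_{l=1}^d C_n^{[l]}(w_j)\, \mathbf{1}(\hat U_{il} \le w_{jl}) \right\},$$

where $w_{jl}$ denotes the $l$th component of $w_j$.

In order to carry out the test based on $S_{r,n}$, it is first necessary to compute the $n \times m$ matrix
$M_n$. Then, to compute $S_{r,n}^{(k)}$, it suffices to generate $n$ i.i.d. random variates $Z_1^{(k)}, \dots, Z_n^{(k)}$
with expectation 0, variance 1, satisfying $\int_0^\infty \{\Pr(|Z_1^{(k)}| > x)\}^{1/2}\, dx < \infty$, and perform
simple arithmetic operations involving the centered $Z_i^{(k)}$ and the columns of matrix $M_n$.
In the Monte Carlo simulations to be presented in the next section, the $Z_i^{(k)}$ are taken
from the standard normal distribution.

For the test based on $T_{r,n}$, clearly,

$$T_{r,n} = \frac{1}{n} \sum_{i=1}^n \{\mathbb{D}_{r,n}(\hat U_i)\}^2 \qquad \text{and} \qquad T_{r,n}^{(k)} = \frac{1}{n} \sum_{i=1}^n \{\mathbb{D}_{r,n}^{(k)}(\hat U_i)\}^2, \qquad k \in \{1, \dots, N\}.$$

Expressions for implementing the test then immediately follow from those given for $S_{r,n}$
and $S_{r,n}^{(k)}$: simply replace $m$ by $n$, and $w_j$ by $\hat U_j$.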
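
The implementation steps of this section can be sketched end to end in plain Python for the statistic evaluated at the pseudo-observations. This is only an illustrative sketch under stated assumptions, not the routine of the copula R package: all function names are hypothetical, the derivative estimate is assumed to be normalized by the truncated interval width, standard normal multipliers are used, the data are simulated, and no small-sample rescaling of the empirical copula is applied.

```python
import math
import random


def pseudo_observations(data):
    """Rank-based pseudo-observations R_ij / (n + 1); assumes no ties."""
    n, d = len(data), len(data[0])
    pseudo = [[0.0] * d for _ in range(n)]
    for j in range(d):
        for rank, i in enumerate(sorted(range(n), key=lambda i: data[i][j]), start=1):
            pseudo[i][j] = rank / (n + 1)
    return pseudo


def empirical_copula(pseudo, u):
    return sum(all(a <= b for a, b in zip(p, u)) for p in pseudo) / len(pseudo)


def derivative_estimate(pseudo, u, j):
    """Finite-difference estimate of the j-th partial derivative, bandwidth n^{-1/2}."""
    step = 1.0 / math.sqrt(len(pseudo))
    hi, lo = min(u[j] + step, 1.0), max(u[j] - step, 0.0)
    u_hi = [hi if l == j else x for l, x in enumerate(u)]
    u_lo = [lo if l == j else x for l, x in enumerate(u)]
    return (empirical_copula(pseudo, u_hi) - empirical_copula(pseudo, u_lo)) / (hi - lo)


def influence_column(pseudo, u):
    """Entries 1(U_i <= u) - sum_l C_n^[l](u) 1(U_il <= u_l), for i = 1, ..., n."""
    derivs = [derivative_estimate(pseudo, u, l) for l in range(len(u))]
    column = []
    for p in pseudo:
        joint = float(all(a <= b for a, b in zip(p, u)))
        marginal = sum(dl for l, dl in enumerate(derivs) if p[l] <= u[l])
        column.append(joint - marginal)
    return column


def multiplier_test(pseudo, r, n_mult, rng):
    """Return the statistic T_{r,n} and its approximate multiplier p-value."""
    n = len(pseudo)
    stat, columns = 0.0, []
    for w in pseudo:  # evaluation points are the pseudo-observations themselves
        w_root = [x ** (1.0 / r) for x in w]
        c_root = empirical_copula(pseudo, w_root)
        process = math.sqrt(n) * (c_root ** r - empirical_copula(pseudo, w))
        stat += process ** 2 / n
        scale = r * c_root ** (r - 1)
        col_root, col = influence_column(pseudo, w_root), influence_column(pseudo, w)
        columns.append([scale * a - b for a, b in zip(col_root, col)])  # one column of M_n
    exceed = 0
    for _ in range(n_mult):
        z = [rng.gauss(0.0, 1.0) for _ in range(n)]
        z_bar = sum(z) / n
        centered = [v - z_bar for v in z]
        rep = sum((sum(a * b for a, b in zip(centered, col)) / math.sqrt(n)) ** 2
                  for col in columns) / n
        exceed += rep >= stat
    return stat, exceed / n_mult


rng = random.Random(1)
x = [rng.gauss(0.0, 1.0) for _ in range(40)]
data = [(a, a + rng.gauss(0.0, 1.0)) for a in x]  # simulated dependent pairs
stat, p_value = multiplier_test(pseudo_observations(data), 3, 100, rng)
print(stat, p_value)
```

The matrix columns are computed once and reused for every set of multipliers, which is the source of the computational savings described in this section.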



5 Finite-sample performance 

Extensive Monte Carlo experiments were conducted to investigate the level and power of
the tests based on $S_{r,n}$ and $T_{r,n}$ for samples of size $n$ = 100, 200, 400 and 800. The values
2, 3, ..., 9 were considered for $r$. We also investigated the finite-sample performance of the
tests based on the statistics $S_{r_1, \dots, r_p, n}$ and $T_{r_1, \dots, r_p, n}$ defined in (6). Approximate p-values
for the latter tests can be obtained by proceeding as in the previous section. Several
configurations were studied, among which $(r_1, \dots, r_p) = (2, 3, \dots, 9)$ and $(3, 4, 5)$. Data
sets of dimension two to five were generated, both from extreme-value and non-extreme-value
copulas. Given that the most frequently used bivariate exchangeable extreme-value
copulas such as the Gumbel-Hougaard, Galambos, Hüsler-Reiss and Student extreme-value
copulas show striking similarities for a given degree of dependence (see Genest et al.,
2011, for a detailed discussion of this matter), only the Gumbel-Hougaard (GH) and its
asymmetric version (aGH) defined using Khoudraji's device (Khoudraji, 1995; Genest
et al., 1998; Liebscher, 2008) were used in the simulations. Given an exchangeable copula
$C_\theta$, Khoudraji's device defines an asymmetric version of it as

$$C_{\theta,\lambda}(u) = u_1^{1-\lambda_1} \cdots u_d^{1-\lambda_d}\, C_\theta(u_1^{\lambda_1}, \dots, u_d^{\lambda_d})$$

for all $u \in [0,1]^d$ and an arbitrary choice of $\lambda = (\lambda_1, \dots, \lambda_d) \in (0,1)^d$ such that $\lambda_i \ne \lambda_j$
for some $\{i,j\} \subset \{1, \dots, d\}$. If $C_\theta$ is an extreme-value copula, then the same is true
of $C_{\theta,\lambda}$. Note that the asymmetric Gumbel-Hougaard (aGH) obtained from Khoudraji's
device is nothing else but the asymmetric logistic model introduced in Tawn (1988, 1990).
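
The claim that Khoudraji's device preserves max-stability can be checked numerically. The following Python sketch uses the standard closed forms of the Gumbel-Hougaard and Clayton copulas and the product form of Khoudraji's device with the independence copula; the function names, the evaluation point and the Clayton parameter are choices made here for illustration.

```python
import math


def gumbel_hougaard(u, theta):
    """Gumbel-Hougaard copula: exp( -( sum_j (-log u_j)^theta )^(1/theta) )."""
    if min(u) <= 0.0:
        return 0.0
    return math.exp(-sum((-math.log(x)) ** theta for x in u) ** (1.0 / theta))


def clayton(u, theta):
    """Clayton copula (theta > 0), which is not an extreme-value copula."""
    if min(u) <= 0.0:
        return 0.0
    return (sum(x ** (-theta) for x in u) - len(u) + 1.0) ** (-1.0 / theta)


def khoudraji(copula, u, lam):
    """Asymmetric version: prod_j u_j^(1 - lam_j) * copula(u_1^lam_1, ..., u_d^lam_d)."""
    weight = math.prod(x ** (1.0 - l) for x, l in zip(u, lam))
    return weight * copula([x ** l for x, l in zip(u, lam)])


def max_stability_gap(copula, u, r):
    """C(u^{1/r})^r - C(u); zero for every u and r when C is max-stable."""
    return copula([x ** (1.0 / r) for x in u]) ** r - copula(u)


def asymmetric_gh(u):
    # theta = 4 and lambda = (0.2, 0.95), one of the bivariate settings of the study
    return khoudraji(lambda v: gumbel_hougaard(v, 4.0), u, (0.2, 0.95))


def clayton_2(u):
    return clayton(u, 2.0)


print(max_stability_gap(asymmetric_gh, (0.3, 0.7), 3))  # zero up to rounding
print(max_stability_gap(clayton_2, (0.3, 0.7), 3))      # clearly non-zero
```

The gap for the Clayton copula is of the order of a few hundredths at this point, which is the population counterpart of what the test process estimates.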



In the experiments, the parameter $\theta$ of the asymmetric Gumbel-Hougaard was set to 4.
In dimension two, the shape parameter vector $\lambda$ was taken equal to $(\lambda_1, 0.95)$, with
$\lambda_1 \in \{0.2, 0.4, 0.6, 0.8\}$, so that data generated from the corresponding copulas display
various degrees of asymmetry. The corresponding values for Kendall's tau are about
0.19, 0.34, 0.48 and 0.60, respectively. In dimension three, four and five, $\lambda$ was set to
(0.2, 0.4, 0.95), (0.2, 0.4, 0.6, 0.95), and (0.2, 0.4, 0.6, 0.8, 0.95), respectively.

As far as non-extreme-value copulas are concerned, the Clayton (C), Frank (F), normal
(N), t with four degrees of freedom (t), and Plackett (P) (for dimension two only) copulas
were used in the experiments.

Table 1: Rejection rate (in %) of the null hypothesis in the bivariate case as observed
in 1000 random samples of size n = 100, 200, 400 and 800 from the Gumbel-Hougaard
copula (GH) and its asymmetric version (aGH) with θ = 4 and λ = (λ_1, 0.95).

Copula  τ     λ_1   T_{3,n}  T_{4,n}  T_{5,n}  T_{3,4,5,n}  \hat\sigma_n^2

n = 100
GH      0.25        5.0      5.1      5.8      5.4          5.3
        0.50        3.9      4.3      4.6      4.3          4.9
        0.75        3.2      3.2      3.8      3.5          5.2
aGH           0.2   5.3      5.5      6.2      5.8          5.4
              0.4   4.6      5.5      5.9      5.5          5.6
              0.6   4.2      4.8      5.2      4.8          4.9
              0.8   4.7      5.0      5.1      5.0          5.0

n = 200
GH      0.25        4.2      5.1      5.4      5.0          5.3
        0.50        3.6      4.1      4.3      4.0          5.1
        0.75        2.1      2.5      2.5      2.3          4.9
aGH           0.2   5.9      5.9      6.7      6.1          5.8
              0.4   4.3      5.1      5.6      5.2          5.4
              0.6   4.4      5.1      5.4      5.0          5.5
              0.8   4.7      5.2      5.6      5.3          4.9

n = 400
GH      0.25        4.3      4.9      5.1      4.8          5.2
        0.50        3.4      3.8      4.2      3.9          5.2
        0.75        2.4      2.5      2.6      2.5          5.3
aGH           0.2   5.4      5.4      6.0      5.8          5.9
              0.4   4.3      5.1      5.5      5.1          5.1
              0.6   4.6      4.9      5.0      4.9          5.1
              0.8   4.5      4.7      5.2      4.8          5.0

n = 800
GH      0.25        4.6      5.1      5.1      5.1          5.0
        0.50        4.0      4.3      4.6      4.4          5.0
        0.75        3.4      3.5      3.6      3.6          5.0
aGH           0.2   4.9      5.1      5.4      5.3          5.4
              0.4   3.3      4.3      4.5      4.2          5.4
              0.6   3.6      4.0      4.7      4.1          5.3
              0.8   4.1      4.4      4.6      4.5          4.9

For each of the one-parameter exchangeable families considered in the study (GH, C,
F, N, t, P), three values of the parameter were considered. These were chosen so that the
bivariate margins of the copulas have a Kendall's tau of 0.25, 0.50, and 0.75, respectively.

All the tests were carried out at the 5% significance level and empirical rejection rates
were computed from 1000 random samples per scenario. For the tests based on $S_{r,n}$, the
parameter $m$ defined in Section 4 was set to $44^2$ in dimension two, $13^3$ in dimension three,
in dimension four, and $5^5$ in dimension five. Smaller and greater values of $m$ were also
considered but this did not seem to affect the results much.

In most scenarios involving extreme-value copulas, the tests turned out to be globally
too conservative, although the agreement with the 5% level seemed to improve as $n$ was
increased. To attempt to improve the empirical levels of the tests for $n \in \{100, 200\}$,
we considered several asymptotically negligible ways of rescaling the empirical copula in
the expression of the test process (3), while keeping the expressions of the processes $\mathbb{D}_{r,n}^{(k)}$,
$k \in \{1, \dots, N\}$, unchanged. Reasonably good empirical levels were obtained by replacing
$C_n$ in the expression of $\mathbb{D}_{r,n}$ by $n(n + 0.85)^{-1} C_n$. With this asymptotically negligible
modification, the best results were obtained for $r \in \{3, 4, 5\}$ and for the tests based on
$T_{r,n}$, which consistently outperformed the tests based on $S_{r,n}$. In dimension two, the
rejection rates of the tests based on $T_{3,n}$, $T_{4,n}$, $T_{5,n}$, and $T_{3,4,5,n}$ are reported in Tables 1
and 2.

Table 2: Rejection rate (in %) of the null hypothesis in the bivariate case as observed in
1000 random samples of size n = 100, 200, 400 and 800 from the Clayton (C), Frank (F),
normal (N), t with four degrees of freedom (t), and Plackett copula (P).

Copula  τ     T_{3,n}  T_{4,n}  T_{5,n}  T_{3,4,5,n}  \hat\sigma_n^2

n = 100
C       0.25  74.4     72.2     72.5     73.8         79.3
        0.50  99.1     98.3     98.2     98.5         99.5
        0.75  99.3     99.7     99.8     99.9         100.0
F       0.25  38.9     43.7     46.4     45.0         22.4
        0.50  62.2     68.8     75.8     71.3         34.7
        0.75  75.0     85.0     89.2     86.9         34.7
N       0.25  26.9     25.5     26.2     26.8         22.3
        0.50  27.5     28.8     30.8     30.8         36.9
        0.75  22.0     24.5     26.9     26.1         43.9
P       0.25  35.2     37.6     42.6     39.3         21.5
        0.50  47.9     54.5     59.2     56.3         30.5
        0.75  42.5     50.1     56.0     53.1         34.6
t       0.25  15.2     14.0     14.4     14.8         15.0
        0.50  22.8     22.9     23.3     23.9         29.4
        0.75  19.2     19.2     20.6     20.7         39.9

n = 200
C       0.25  93.3     94.2     94.0     94.6         97.9
        0.50  100.0    100.0    100.0    100.0        100.0
        0.75  100.0    100.0    100.0    100.0        100.0
F       0.25  56.1     66.2     69.8     66.1         38.2
        0.50  88.8     95.2     96.5     96.0         60.3
        0.75  96.7     98.9     99.4     99.0         57.3
N       0.25  32.5     38.4     39.5     38.7         36.2
        0.50  44.6     50.2     52.8     51.0         63.0
        0.75  33.9     46.6     50.7     46.7         75.9
P       0.25  50.0     59.0     63.3     59.2         38.1
        0.50  71.1     81.0     84.8     81.7         61.7
        0.75  60.9     76.4     83.6     78.4         58.5
t       0.25  17.6     18.7     18.1     18.4         26.6
        0.50  31.1     33.0     31.4     33.4         52.7
        0.75  26.2     33.5     33.4     34.2         69.2

n = 400
C       0.25  99.8     99.8     100.0    100.0        100.0
        0.50  100.0    100.0    100.0    100.0        100.0
        0.75  100.0    100.0    100.0    100.0        100.0
F       0.25  85.7     91.5     94.1     92.5         64.6
        0.50  99.6     100.0    100.0    100.0        87.6
        0.75  100.0    100.0    100.0    100.0        85.9
N       0.25  55.6     60.7     63.7     61.4         61.7
        0.50  72.1     77.9     80.0     79.0         90.9
        0.75  55.8     70.9     77.3     72.8         96.3
P       0.25  75.8     84.0     87.5     85.2         61.4
        0.50  95.1     97.0     98.8     97.7         87.4
        0.75  92.4     97.2     98.9     97.9         87.9
t       0.25  28.2     27.5     25.9     28.0         44.4
        0.50  55.3     57.1     56.2     58.4         84.2
        0.75  49.7     56.3     56.8     56.8         93.7

n = 800
C       0.25  100.0    100.0    100.0    100.0        100.0
        0.50  100.0    100.0    100.0    100.0        100.0
        0.75  100.0    100.0    100.0    100.0        100.0
F       0.25  99.3     99.8     99.9     99.8         92.6
        0.50  100.0    100.0    100.0    100.0        99.1
        0.75  100.0    100.0    100.0    100.0        99.4
N       0.25  85.9     89.5     90.7     90.2         90.0
        0.50  96.7     98.2     98.9     98.3         99.5
        0.75  94.2     97.6     99.0     98.5         100.0
P       0.25  96.8     99.1     99.5     99.2         89.5
        0.50  100.0    100.0    100.0    100.0        99.1
        0.75  99.9     100.0    100.0    100.0        99.4
t       0.25  52.1     45.6     AAA      47.6         74.1
        0.50  85.6     86.7     85.5     87.8         99.0
        0.75  87.6     89.6     90.8     90.9         100.0

As can be seen from Table 1, the empirical levels of the selected tests are, overall,
reasonably close to the 5% nominal level for $\tau \in \{0.25, 0.5\}$ and $\lambda_1 \in \{0.2, 0.4, 0.6, 0.8\}$,
which, as discussed earlier, corresponds to weak to moderate dependence. The tests
remain however too conservative when $\tau = 0.75$, although the empirical levels seem
globally to improve as $n$ increases. An inspection of Table 2 shows that, in terms of
power, the tests based on $T_{4,n}$ and $T_{5,n}$ appear globally more powerful than that based
on $T_{3,n}$, although the latter sometimes outperforms the former in the case of weakly
dependent data sets. As far as the test based on $T_{3,4,5,n}$ is concerned, its rejection rates
are almost always greater than those of $T_{4,n}$, and sometimes greater than those of $T_{5,n}$.

The previous tests can be compared with the test of extreme-value dependence
proposed by Ghoudi et al. (1998) and improved by Ben Ghorbal et al. (2009). The rejection
rates of the best version of that test, based on a variance estimator denoted $\hat\sigma_n^2$, were
computed using routines available in the copula R package, and are reported in Tables 1 and 2.
The test based on $\hat\sigma_n^2$ is more powerful than its competitors when data are generated
from an elliptical copula, the gain in power being particularly large for the t copula.
The proposed tests perform better when data are generated from a Frank or a Plackett
copula. For $n = 100$ and the Frank copula, the rejection rates of the test based on $T_{3,4,5,n}$ are
approximately twice as great as those of the test based on $\hat\sigma_n^2$. From the lower right block
of Table 2, we also see that, for all tests, the optimal rejection rate is almost reached in
all scenarios not involving extreme-value copulas when $n = 800$.

The rejection rates of the test based on $T_{3,4,5,n}$ for data sets of dimension three, four
and five are given in Table 3. As can be seen from the first two horizontal blocks of the
table, in the case of weak to moderate dependence, the test appears slightly conservative,
overall, although the agreement with the 5% level seems to improve as $n$ increases. As in
dimension two, the test is the most conservative in the case of strongly dependent data
and this phenomenon increases with the dimension. Notice however that, in almost all
scenarios under the alternative hypothesis, the power of the test increases as $d$ increases.
This might be due to the fact that every bivariate margin of a $d$-variate extreme-value
copula must be max-stable. Hence, deviations from multivariate max-stability might be
easier to detect as the dimension increases. Note finally that, as $n$ reaches 800, the optimal
rejection rate is almost attained in all scenarios not involving extreme-value copulas.



6 Discussion and illustrations 

The results of the extensive Monte Carlo experiments partially reported in the previous
section suggest that the test based on the statistic $T_{3,4,5,n}$ can be safely used in dimension
two or greater to assess whether data arise from an extreme-value copula. The choice
of the statistic $T_{3,4,5,n}$ is not claimed to be optimal as other candidate test statistics
could be considered. In dimension two, the test appears more powerful than the test
of Ben Ghorbal et al. (2009) based on $\hat\sigma_n^2$ in approximately half of the scenarios under
the alternative hypothesis, and is outperformed in the remaining scenarios. In dimension
strictly greater than two, the proposed approach is presently, to the best of our knowledge,
the only available procedure for testing extreme-value dependence.

As an illustration, we first applied the test based on $T_{3,4,5,n}$ to the bivariate indemnity
payment and allocated loss adjustment expense data studied in Frees and Valdez (1998).
These consist of 1466 general liability claims randomly chosen from late settlement lags
(among the initial 1500 claims, 34 claims for which the policy limit was reached were
ignored). Many studies, including that of Ben Ghorbal et al. (2009), have concluded that
an extreme-value copula is likely to provide an adequate model of the dependence.

Table 3: Rejection rate (in %) of the null hypothesis for the test based on $T_{3,4,5,n}$ for d = 3, 4 and 5 as observed in 1000 random
samples of size n = 100, 200, 400 and 800 from the Gumbel-Hougaard (GH), its asymmetric version (aGH), the Clayton (C), Frank
(F), normal (N), and the t copula with four degrees of freedom (t). The parameters of aGH are θ = 4 and λ = (0.2, 0.4, 0.95) in
dimension three, λ = (0.2, 0.4, 0.6, 0.95) in dimension four, and λ = (0.2, 0.4, 0.6, 0.8, 0.95) in dimension five.

True          d = 3                       d = 4                       d = 5
copula  τ     100    200    400    800    100    200    400    800    100    200    400    800
GH      0.25  5.0    4.9    4.8    4.6    4.8    5.0    4.3    4.9    4.2    4.5    4.6    4.8
        0.50  2.8    3.0    3.4    4.0    2.3    2.8    3.2    3.6    2.2    2.2    2.7    3.6
        0.75  0.9    1.1    1.6    2.4    0.4    0.5    0.9    1.8    0.2    0.3    0.6    1.0
aGH           5.5    4.4    4.4    4.8    4.2    3.6    3.9    4.0    3.5    3.4    3.0    3.4
C       0.25  91.9   99.9   100.0  100.0  98.2   100.0  100.0  100.0  98.8   100.0  100.0  100.0
        0.50  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0
        0.75  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0  100.0
F       0.25  59.0   86.9   99.4   100.0  62.2   94.6   99.9   100.0  65.6   96.0   100.0  100.0
        0.50  83.3   98.9   100.0  100.0  88.0   99.8   100.0  100.0  90.1   99.9   100.0  100.0
        0.75  91.0   100.0  100.0  100.0  89.8   100.0  100.0  100.0  89.3   99.8   100.0  100.0
N       0.25  35.2   65.2   91.3   99.1   46.9   76.9   97.5   100.0  52.1   86.3   99.1   100.0
        0.50  39.6   69.9   94.7   99.9   41.1   76.8   97.4   100.0  45.8   80.7   98.4   100.0
        0.75  20.7   44.8   85.0   99.6   12.9   42.6   85.7   99.9   10.6   32.7   84.5   99.9
t       0.25  16.9   23.6   42.1   67.0   16.7   28.0   50.7   79.9   20.3   33.2   58.7   83.9
        0.50  23.8   44.8   74.3   96.9   26.3   48.7   83.2   99.0   26.8   53.8   84.0   99.8
        0.75  12.6   26.3   65.6   97.-1  7.1    19.8   60.4   98.1   3.9    15.3   55.7   96.3

Table 4: Approximate p-values (in %) for the test based on $T_{3,4,5,n}$ obtained for the
triples of variables {U, Co, Li}, {U, Li, Ti} and {Ti, Li, Cs} of the uranium data of Cook
and Johnson (1986).

              Random ranks for ties
Variables     Minimum  Median  Maximum   Mid-ranks
{U, Co, Li}   0.0      0.0     0.1       0.0
{U, Li, Ti}   0.0      0.1     0.3       0.0
{Ti, Li, Cs}  2.5      3.9     5.5       1.8

Note that these data contain a non-negligible number of ties. As is the case for
other procedures based on the empirical copula, the presence of ties might significantly
affect the tests under study since these were derived under the assumption of continuous
margins. To deal with ties in a reasonably satisfactory way, Kojadinovic and Yan (2010)
suggested assigning ranks at random in the case of ties when computing pseudo-observations.
This was done using the R function rank with its argument ties.method set to "random". The
test was then carried out on the resulting pseudo-observations. With the hope that the
use of randomization will result in many different configurations for the parts of the data
affected by ties, the test based on the pseudo-observations computed with ties.method =
"random" was performed 100 times with $N = 1000$. The minimum, median and maximum
of the obtained approximate p-values are 40.7%, 45.9%, and 50.4%, respectively. If the
pseudo-observations are computed using mid-ranks, the approximate p-value, based on
$N = 10\,000$ multiplier iterations, drops down to 1.7%. As already observed in other
situations, using mid-ranks seems to increase the evidence against the null hypothesis.
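
The two ways of handling ties discussed above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code used for the analysis (which relied on the R function rank): the helper names `ranks` and `pseudo_observations` are hypothetical, and the pseudo-observations are taken to be the ranks divided by n + 1.

```python
import random


def ranks(values, ties="random", rng=None):
    """Return 1-based ranks of `values` (hypothetical helper).

    ties="random"  : tied values receive distinct ranks in random order.
    ties="average" : tied values all receive their mid-rank.
    """
    n = len(values)
    out = [0.0] * n
    if ties == "random":
        rng = rng or random.Random()
        # A random secondary key breaks ties at random.
        keyed = sorted((v, rng.random(), i) for i, v in enumerate(values))
        for r, (_, _, i) in enumerate(keyed, start=1):
            out[i] = float(r)
        return out
    if ties == "average":
        order = sorted(range(n), key=lambda i: values[i])
        start = 0
        while start < n:
            end = start
            while end + 1 < n and values[order[end + 1]] == values[order[start]]:
                end += 1
            mid_rank = (start + end) / 2 + 1  # average of ranks start+1..end+1
            for k in range(start, end + 1):
                out[order[k]] = mid_rank
            start = end + 1
        return out
    raise ValueError("ties must be 'random' or 'average'")


def pseudo_observations(column, ties="random", rng=None):
    """Scale the ranks of one margin by 1 / (n + 1)."""
    n = len(column)
    return [r / (n + 1) for r in ranks(column, ties, rng)]


sample = [1.0, 2.0, 2.0, 3.0]
mid = pseudo_observations(sample, ties="average")
rnd = pseudo_observations(sample, ties="random", rng=random.Random(0))
```

With random tie-breaking the two tied observations receive two distinct pseudo-observations whose order changes from one repetition to the next, which is why the test is repeated many times and the resulting p-values are summarized.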

As a second example, we considered the uranium exploration data of Cook and Johnson
(1986). The data consist of log-concentrations of seven chemical elements in 655 water
samples collected near Grand Junction, Colorado: uranium (U), lithium (Li), cobalt (Co),
potassium (K), cesium (Cs), scandium (Sc), and titanium (Ti). Ben Ghorbal et al. (2009)
performed an extensive study of the 21 pairs of variables and suggested that the triples
{U,Co,Li}, {U,Li,Ti} and {Ti,Li,Cs} should be investigated for trivariate extreme-value
dependence once a multivariate test becomes available. Note that the number of ties in
these data is greater than in the insurance data of Frees and Valdez (1998). In particular,
the variable Li takes only 90 different values out of 655. For that reason, as previously,
we broke the ties at random and repeated the calculations 100 times with $N = 1000$.
Approximate p-values for the test based on $T_{3,4,5,n}$ are summarized in Table 4. The first
three columns give the minimum, median and maximum of the obtained p-values. The
last column gives the p-values computed from the mid-ranks using $N = 10\,000$. As for the
insurance data, we see that the use of mid-ranks increases the evidence against the null
hypothesis. Based on the randomization approach, we conclude that there is strong
evidence against trivariate extreme-value dependence in the triples {U,Co,Li} and {U,Li,Ti},
while there is only marginal evidence against trivariate extreme-value dependence in the
triple {Ti,Li,Cs}.
A Proofs of Propositions 1 and 2



Proof of Proposition 1. From the limiting behavior of the empirical copula process
given in Section 2 and the functional version of Slutsky's theorem (see e.g., van der Vaart
and Wellner, 2000, Chap. 3.9), we have that

\[
\sqrt{n}\,\{C_n(u) - C(u)\} \rightsquigarrow \mathbb{C}(u) \tag{7}
\]

in $\ell^\infty([0,1]^d)$. The desired result then follows from the continuous mapping theorem
(see e.g., van der Vaart and Wellner, 2000, Theorem 1.3.6). □



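Proof of Proposition 2. Let $j \in \{1, \dots, d\}$, and notice that

\[
C_n^{[j]}(u) = \frac{C_n(u_1, \dots, u_{j-1}, u_{j,n}^+, u_{j+1}, \dots, u_d) - C_n(u_1, \dots, u_{j-1}, u_{j,n}^-, u_{j+1}, \dots, u_d)}{u_{j,n}^+ - u_{j,n}^-},
\qquad u \in [0,1]^d.
\]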



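Also, $1/\sqrt{n} \le u_{j,n}^+ - u_{j,n}^- \le 2/\sqrt{n}$ for all $u_j \in [0,1]$ and all $n \ge 1$. Hence,

\[
C_n^{[j]}(u) \le \frac{1}{\sqrt{n}} \sum_{i=1}^n \mathbf{1}\{(n+1)u_{j,n}^- < R_{ij} \le (n+1)u_{j,n}^+\},
\qquad u \in [0,1]^d,
\]

where $R_{ij}$ is the rank of $X_{ij}$ among $X_{1j}, \dots, X_{nj}$. It follows that

\[
\sup_{u \in [0,1]^d} C_n^{[j]}(u)
\le \sup_{u \in [0,1]^d} \frac{(n+1)u_{j,n}^+ - (n+1)u_{j,n}^- + 1}{\sqrt{n}}
\le \frac{2(n+1)}{n} + \frac{1}{\sqrt{n}}
\le 5.
\]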

Thus, $C_n^{[j]}(u) \le 5$ for all $j \in \{1, \dots, d\}$, all $u \in [0,1]^d$ and all $n \ge 1$. The latter
fact, combined with Lemma 2 given in Appendix C stating a uniform convergence in
probability of $C_n^{[j]}$ to $C^{[j]}$, enables us to use Proposition 3.2 of Segers (2011), from which
it follows that

\[
(\mathbb{C}_n, \mathbb{C}_n^{(1)}, \dots, \mathbb{C}_n^{(N)}) \rightsquigarrow (\mathbb{C}, \mathbb{C}^{(1)}, \dots, \mathbb{C}^{(N)})
\]

in $\{\ell^\infty([0,1]^d)\}^{N+1}$, where $\mathbb{C}^{(1)}, \dots, \mathbb{C}^{(N)}$ are independent copies of $\mathbb{C}$. The desired result
is then a consequence of the continuous mapping theorem and the fact that $C_n$ converges
uniformly in probability to $C$. □
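
The uniform bound by 5 used in the proof above can be checked numerically. The following Python sketch is only an illustration under simplifying assumptions: there are no ties, so the ranks in one margin are exactly 1, ..., n, and $u_j$ is restricted to a finite grid. The function names `indicator_sum` and `max_scaled_count` are hypothetical.

```python
import math


def indicator_sum(n, u_minus, u_plus):
    """Number of ranks R in {1, ..., n} with (n+1)*u_minus < R <= (n+1)*u_plus."""
    return sum(1 for r in range(1, n + 1) if (n + 1) * u_minus < r <= (n + 1) * u_plus)


def max_scaled_count(n, grid=200):
    """Largest value of indicator_sum / sqrt(n) over a grid of u_j in [0, 1]."""
    h = 1.0 / math.sqrt(n)
    worst = 0.0
    for k in range(grid + 1):
        u = k / grid
        u_plus = min(u + h, 1.0)   # (u_j + n^{-1/2}) truncated at 1
        u_minus = max(u - h, 0.0)  # (u_j - n^{-1/2}) truncated at 0
        worst = max(worst, indicator_sum(n, u_minus, u_plus) / math.sqrt(n))
    return worst


bounds = {n: max_scaled_count(n) for n in (1, 2, 5, 10, 50, 200)}
```

Each value in `bounds` is an upper bound for the derivative estimator at the grid points, and none of them should exceed 5, in line with the chain of inequalities in the proof.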
B Proof of Proposition 3



In order to prove the joint weak convergence of $T_{r,n}, T_{r,n}^{(1)}, \dots, T_{r,n}^{(N)}$, we first show a lemma.

Let $A$ be the space of bounded, Borel measurable functions on $[0,1]^d$ and let $B$ be the
space of c.d.f.s of finite Borel measures on $[0,1]^d$. Define $\phi : A \times B \to \mathbb{R}$ by $\phi(a, b) = \int a \, db$
and denote $\|f\|_\infty = \sup_{u \in [0,1]^d} |f(u)|$ for $f : [0,1]^d \to \mathbb{R}$. The topologies on $A$ and $B$ are
the ones induced by uniform convergence. The topology on $A \times B$, $A^{N+1}$ or $A^{N+1} \times B$ is
the product topology.

Lemma 1. The map $\phi$ is continuous at each pair $(a_0, b_0)$ of $A \times B$ such that the functions
$a_0$ and $b_0$ are continuous on $[0,1]^d$.

Proof. Let $(a_n, b_n)$ be a sequence in $A \times B$ such that $\|a_n - a_0\|_\infty \to 0$ and $\|b_n - b_0\|_\infty \to 0$.
We have to show that $\int a_n \, db_n \to \int a_0 \, db_0$. Let $\beta_n$ and $\beta_0$ be the finite Borel measures
on $[0,1]^d$ associated with the c.d.f.s $b_n$ and $b_0$, respectively. By the triangle inequality,

\[
\left| \int a_n \, db_n - \int a_0 \, db_0 \right|
\le \left| \int a_n \, db_n - \int a_0 \, db_n \right|
+ \left| \int a_0 \, db_n - \int a_0 \, db_0 \right|.
\]

We treat the two terms on the right-hand side of the previous inequality separately.
First, by uniform convergence and the continuity of $b_0$ on $[0,1]^d$, we have

\[
\beta_n([0,1]^d) = b_n(1, \dots, 1) \to b_0(1, \dots, 1) = \beta_0([0,1]^d).
\]

As a consequence,

\[
\left| \int a_n \, db_n - \int a_0 \, db_n \right|
\le \int |a_n - a_0| \, db_n
\le \|a_n - a_0\|_\infty \, \beta_n([0,1]^d) \to 0.
\]

Second, as $\|b_n - b_0\|_\infty \to 0$, we have $\beta_n \to \beta_0$ in the topology of weak convergence of
finite Borel measures. By continuity of the function $a_0$, this implies $\int a_0 \, db_n \to \int a_0 \, db_0$,
as required. □
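
A one-dimensional toy example may help to see what Lemma 1 asserts. In the Python sketch below, every choice is an assumption made for illustration only: $b_n$ is the c.d.f. of the uniform probability measure on the grid {1/n, ..., n/n}, which converges uniformly to $b_0(x) = x$, and $a_n(x) = x^2 + 1/n$ converges uniformly to $a_0(x) = x^2$. The lemma then predicts that $\int a_n \, db_n$ tends to $\int_0^1 x^2 \, dx = 1/3$.

```python
def integral_against_grid_measure(a, n):
    """Integral of `a` with respect to the uniform probability measure on {1/n, ..., n/n}."""
    return sum(a(k / n) for k in range(1, n + 1)) / n


def a_n(n):
    """Perturbed integrand a_n(x) = x^2 + 1/n, uniformly close to a_0(x) = x^2."""
    return lambda x: x * x + 1.0 / n


# Absolute error with respect to the limit 1/3 for increasing n.
errors = [abs(integral_against_grid_measure(a_n(n), n) - 1.0 / 3.0) for n in (10, 100, 1000)]
```

The errors shrink as n grows, which is the behavior the lemma guarantees when both limits are continuous.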

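Proof of Proposition 3. The fact that $S_{r,n}, S_{r,n}^{(1)}, \dots, S_{r,n}^{(N)}$ jointly converge weakly to
independent copies of the same limit is an immediate consequence of Proposition 2 and
the continuous mapping theorem.

Let us show the corresponding result for $T_{r,n}, T_{r,n}^{(1)}, \dots, T_{r,n}^{(N)}$. Observe first that both
$A$ and $B$, defined at the beginning of this Appendix, are subsets of the space $\ell^\infty([0,1]^d)$.
Next, the map $A^{N+1} \to A^{N+1} \times B : (a^{(1)}, \dots, a^{(N+1)}) \mapsto (a^{(1)}, \dots, a^{(N+1)}, b)$ being
continuous, we have, from Proposition 2 and the continuous mapping theorem, that

\[
(\mathbb{D}_{r,n}, \mathbb{D}_{r,n}^{(1)}, \dots, \mathbb{D}_{r,n}^{(N)}, C) \rightsquigarrow (\mathbb{D}_r, \mathbb{D}_r^{(1)}, \dots, \mathbb{D}_r^{(N)}, C)
\]

in $A^{N+1} \times B$. Since $\|C_n - C\|_\infty$ converges to zero in probability, it follows that

\[
(\mathbb{D}_{r,n}, \mathbb{D}_{r,n}^{(1)}, \dots, \mathbb{D}_{r,n}^{(N)}, C_n) \rightsquigarrow (\mathbb{D}_r, \mathbb{D}_r^{(1)}, \dots, \mathbb{D}_r^{(N)}, C) \tag{8}
\]

in $A^{N+1} \times B$.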
Let $A_0 = \{a \in A : a \text{ is continuous}\}$ and $B_0 = \{b \in B : b \text{ is continuous}\}$. Copulas
being continuous, $C$ belongs to $B_0$. From Proposition 1, we have that $\mathbb{D}_r$ belongs to $A_0$
with probability one since the same is true for $\mathbb{C}$ defined in (2). The limiting processes
$\mathbb{D}_r^{(1)}, \dots, \mathbb{D}_r^{(N)}$ also belong to $A_0$ with probability one since they are independent copies
of $\mathbb{D}_r$.

By Lemma 1, the map $\phi : A \times B \to \mathbb{R}$ is continuous at every point of $A_0 \times B_0$. It
follows that the map $\Phi : A^{N+1} \times B \to \mathbb{R}^{N+1}$ defined by

\[
\Phi(a^{(1)}, \dots, a^{(N+1)}, b) = (\phi(a^{(1)}, b), \dots, \phi(a^{(N+1)}, b))
\]

is continuous at every point of $A_0^{N+1} \times B_0$. From (8) and the continuous mapping theorem,
we finally obtain the joint weak convergence of $T_{r,n}, T_{r,n}^{(1)}, \dots, T_{r,n}^{(N)}$,
which is the desired result. □



C Estimators of the partial derivatives 

In order to estimate the unknown partial derivative $C^{[j]}$, $j \in \{1, \dots, d\}$, besides the
estimator $C_n^{[j]}$ defined in (5), one could use the estimator proposed by Rémillard and
Scaillet (2009), and defined by

\[
C_{n,RS}^{[j]}(u) = \frac{1}{2n^{-1/2}} \bigl\{ C_n(u_1, \dots, u_{j-1}, u_{j,n}^+, u_{j+1}, \dots, u_d)
- C_n(u_1, \dots, u_{j-1}, u_{j,n}^-, u_{j+1}, \dots, u_d) \bigr\}, \qquad u \in [0,1]^d,
\]

where $u_{j,n}^+ = (u_j + n^{-1/2}) \wedge 1$, and $u_{j,n}^- = (u_j - n^{-1/2}) \vee 0$.

It is easy to verify that, for fixed $0 < a < b < 1$ and $n$ sufficiently large, $C_n^{[j]}$ and
$C_{n,RS}^{[j]}$ coincide on $\{u \in [0,1]^d : a \le u_j \le b\}$, and hence, from Lemma 2 below, if $C^{[j]}$ is
continuous on the set $V_j$ defined in Condition (C), both estimators converge in probability
to $C^{[j]}$ uniformly on $\{u \in [0,1]^d : a \le u_j \le b\}$.

It could be argued that the following is a desirable property of an estimator of $C^{[j]}$:
if $C^{[j]}$ happens to be continuous on $[0,1]^d$ instead of $V_j$, the estimator should converge in
probability to $C^{[j]}$ uniformly on $[0,1]^d$. This property is satisfied by $C_n^{[j]}$ as is verified in
Lemma 2 below. It is however not satisfied by $C_{n,RS}^{[j]}$ since the latter estimator does not
converge pointwise in probability at points $u$ of $[0,1]^d$ such that $u_j = 0$ or $u_j = 1$.
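
The difference between the two estimators can be made concrete with a small Python sketch. It is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the coordinate index `j` is 0-based, the pseudo-observations are ranks divided by n + 1, and the data are simulated independent uniforms. The first returned value divides the increment of the empirical copula by $u_{j,n}^+ - u_{j,n}^-$, the second by $2n^{-1/2}$.

```python
import math
import random


def empirical_copula(pseudo_obs, u):
    """C_n(u): proportion of pseudo-observations that are componentwise <= u."""
    n = len(pseudo_obs)
    return sum(all(x <= t for x, t in zip(row, u)) for row in pseudo_obs) / n


def _with_coordinate(u, j, value):
    """Copy of u with its j-th coordinate (0-based) replaced by value."""
    v = list(u)
    v[j] = value
    return v


def partial_derivative_estimators(pseudo_obs, u, j):
    """Return the pair (truncated-denominator estimate, fixed 2 n^{-1/2} denominator estimate)."""
    n = len(pseudo_obs)
    h = 1.0 / math.sqrt(n)
    u_plus = min(u[j] + h, 1.0)
    u_minus = max(u[j] - h, 0.0)
    increment = (empirical_copula(pseudo_obs, _with_coordinate(u, j, u_plus))
                 - empirical_copula(pseudo_obs, _with_coordinate(u, j, u_minus)))
    return increment / (u_plus - u_minus), increment / (2.0 * h)


def column_ranks(column):
    """1-based ranks of a column without ties."""
    order = sorted(range(len(column)), key=column.__getitem__)
    result = [0] * len(column)
    for r, i in enumerate(order, start=1):
        result[i] = r
    return result


# Simulated bivariate sample with independent uniform margins (illustration only).
rng = random.Random(1)
n = 400
data = [(rng.random(), rng.random()) for _ in range(n)]
rank_columns = [column_ranks([row[k] for row in data]) for k in range(2)]
pobs = [tuple(rank_columns[k][i] / (n + 1) for k in range(2)) for i in range(n)]

interior = partial_derivative_estimators(pobs, [0.5, 0.5], 0)
boundary = partial_derivative_estimators(pobs, [1.0, 0.5], 0)
```

At the interior point the two values agree, whereas at the boundary point $u_j = 1$ the truncated denominator is half as large, so the first estimate is twice the second. This factor of two is the reason the fixed-denominator version cannot converge to the same limit on the boundary.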

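Lemma 2. Let $j \in \{1, \dots, d\}$, $0 < a < b < 1$, and assume that Condition (C) holds.
Then,

\[
\sup_{\substack{u \in [0,1]^d \\ u_j \in [a,b]}} \bigl| C_n^{[j]}(u) - C^{[j]}(u) \bigr| \overset{\Pr}{\to} 0.
\]

If, additionally, the partial derivative $C^{[j]}$ is continuous on $[0,1]^d$, then,

\[
\sup_{u \in [0,1]^d} \bigl| C_n^{[j]}(u) - C^{[j]}(u) \bigr| \overset{\Pr}{\to} 0.
\]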
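Proof. Without loss of generality, fix $j = 1$, and, for any $u \in [0,1]^d$, let $u_{-1}$ denote the
vector $(u_2, \dots, u_d)$ of $[0,1]^{d-1}$. Also, let $\delta > 0$ be a real number such that $0 < \delta < a < b <
1 - \delta < 1$, and let $n$ be sufficiently large such that, for any $x \in [a,b]$, $x \pm n^{-1/2} \in [\delta, 1-\delta]$.
Now, for any $u \in [0,1]^d$ such that $u_1 \in [a,b]$, we can write

\[
C_n^{[1]}(u) = \frac{C(u_{1,n}^+, u_{-1}) - C(u_{1,n}^-, u_{-1})}{u_{1,n}^+ - u_{1,n}^-}
+ \frac{\mathbb{C}_n(u_{1,n}^+, u_{-1}) - \mathbb{C}_n(u_{1,n}^-, u_{-1})}{\sqrt{n}\,(u_{1,n}^+ - u_{1,n}^-)}.
\]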



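From the fact that $u_{1,n}^+ - u_{1,n}^- = 2n^{-1/2}$ for all $u_1 \in [a,b]$, we obtain that

\[
\sup_{\substack{u \in [0,1]^d \\ u_1 \in [a,b]}} \bigl| C_n^{[1]}(u) - C^{[1]}(u) \bigr|
\le \sup_{\substack{u \in [0,1]^d \\ u_1 \in [a,b]}} \left| \frac{C(u_{1,n}^+, u_{-1}) - C(u_{1,n}^-, u_{-1})}{u_{1,n}^+ - u_{1,n}^-} - C^{[1]}(u) \right|
+ \sup_{\substack{u \in [0,1]^d \\ u_1 \in [a,b]}} \bigl| \mathbb{C}_n(u_{1,n}^+, u_{-1}) - \mathbb{C}_n(u_{1,n}^-, u_{-1}) \bigr|, \tag{9}
\]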

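where $\mathbb{C}_n = \sqrt{n}(C_n - C)$. Since $C^{[1]}$ exists and is continuous on the set $V_1 = \{u \in [0,1]^d :
0 < u_1 < 1\}$, from the mean value theorem, we have that

\[
\frac{C(u_{1,n}^+, u_{-1}) - C(u_{1,n}^-, u_{-1})}{u_{1,n}^+ - u_{1,n}^-} = C^{[1]}(u_{1,n}^*, u_{-1}),
\]

where $u_{1,n}^* \in (u_{1,n}^-, u_{1,n}^+) \subset [\delta, 1-\delta]$. It follows that

\[
\sup_{\substack{u \in [0,1]^d \\ u_1 \in [a,b]}} \left| \frac{C(u_{1,n}^+, u_{-1}) - C(u_{1,n}^-, u_{-1})}{u_{1,n}^+ - u_{1,n}^-} - C^{[1]}(u) \right|
= \sup_{\substack{u \in [0,1]^d \\ u_1 \in [a,b]}} \bigl| C^{[1]}(u_{1,n}^*, u_{-1}) - C^{[1]}(u) \bigr|
\le \sup_{\substack{(u', u) \in [0,1]^{d+1} \\ u', u_1 \in [\delta, 1-\delta] \\ |u' - u_1| \le n^{-1/2}}} \bigl| C^{[1]}(u', u_{-1}) - C^{[1]}(u) \bigr|.
\]

The term on the right converges to zero as $n$ tends to infinity because $C^{[1]}$ is uniformly
continuous on the set $\{u \in [0,1]^d : \delta \le u_1 \le 1 - \delta\}$.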

The fact that

\[
\sup_{\substack{u \in [0,1]^d \\ u_1 \in [a,b]}} \bigl| \mathbb{C}_n(u_{1,n}^+, u_{-1}) - \mathbb{C}_n(u_{1,n}^-, u_{-1}) \bigr| \overset{\Pr}{\to} 0 \tag{10}
\]

is a consequence of the asymptotic equicontinuity of the sequence $\mathbb{C}_n$, which follows from
the weak convergence of $\mathbb{C}_n$ in $\ell^\infty([0,1]^d)$ to the Gaussian process with continuous paths
$\mathbb{C}$ defined in (2) (see e.g., Kosorok, 2008, page 115 and Equation (2.6)).

To prove the second statement, notice that (9) with $a = 0$ and $b = 1$ holds for all
$n \ge 1$ because $0 < 1/(u_{1,n}^+ - u_{1,n}^-) \le \sqrt{n}$ for all $u_1 \in [0,1]$ and all $n \ge 1$. Then, applying
the mean value theorem as previously, one obtains that

\[
\sup_{u \in [0,1]^d} \left| \frac{C(u_{1,n}^+, u_{-1}) - C(u_{1,n}^-, u_{-1})}{u_{1,n}^+ - u_{1,n}^-} - C^{[1]}(u) \right|
= \sup_{u \in [0,1]^d} \bigl| C^{[1]}(u_{1,n}^*, u_{-1}) - C^{[1]}(u) \bigr|
\le \sup_{\substack{(u', u) \in [0,1]^{d+1} \\ |u' - u_1| \le n^{-1/2}}} \bigl| C^{[1]}(u', u_{-1}) - C^{[1]}(u) \bigr|,
\]

where $u_{1,n}^* \in (u_{1,n}^-, u_{1,n}^+) \subset [0,1]$. The term on the right of the previous display converges
to zero as $n$ tends to infinity because $C^{[1]}$ is uniformly continuous on $[0,1]^d$. The desired
result finally follows from the fact that the result stated in (10) with $a = 0$ and $b = 1$
holds for the same reasons as previously. □